Cyber-Deception and Attribution in Capture-the-Flag Exercises
Attributing the culprit of a cyber-attack is widely considered one of the major technical and policy challenges of cyber-security. Previous studies have been limited by the lack of ground truth about the individual responsible for a given attack. Here, we overcome this limitation by leveraging DEFCON capture-the-flag (CTF) exercise data, where the actual ground truth is known. In this work, we use various classification techniques to identify the culprit in a cyber-attack and find that deceptive activities account for the majority of misclassified samples. We also explore several heuristics to alleviate some of the misclassification caused by deception.
Comment: 4 pages; short paper accepted to FOSINT-SI 201
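The abstract frames attribution as supervised classification over labeled attack samples. As a minimal, hypothetical sketch of that setup (the paper's actual features and classifiers are not named above; the feature vectors, team names, and nearest-neighbor choice here are invented for illustration):

```python
# Hypothetical sketch: attribute an attack to a team by classifying its
# feature vector (e.g. a payload summary) against previously labeled attacks.
# The deceptive sample below mimics another team's profile -- the kind of
# activity the paper finds drives most misclassification.
from collections import Counter
from math import dist

def knn_attribute(train, sample, k=3):
    """Attribute `sample` to a team via k-nearest neighbors over labeled attacks."""
    neighbors = sorted(train, key=lambda t: dist(t[0], sample))[:k]
    votes = Counter(team for _, team in neighbors)
    return votes.most_common(1)[0][0]

# Toy labeled attacks: (feature_vector, team)
train = [
    ((0.9, 0.1), "team_a"), ((0.8, 0.2), "team_a"),
    ((0.1, 0.9), "team_b"), ((0.2, 0.8), "team_b"),
    ((0.85, 0.15), "team_b"),  # deceptive: team_b mimicking team_a's profile
]

print(knn_attribute(train, (0.88, 0.12)))  # neighbors dominated by team_a
```

Note how the deceptive sample sits among team_a's points: with a larger share of such mimicry, majority voting starts attributing team_b's attacks to team_a, which is the failure mode the paper's heuristics target.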
Tools and Experiments for Software Security
The computer security problems we face begin in the programs we write. The exploitation of vulnerabilities that leads to the theft of private information and other nefarious activities often begins with a vulnerability accidentally introduced into a program by its author. What factors lead to the creation of these vulnerabilities? Software development and programming are in part synthetic activities that we can control with technology, i.e., different programming languages and software development tools. Does changing the technology used to program software help programmers write more secure code? Can we create technology that helps programmers make fewer mistakes?
This dissertation examines these questions. We start with the Build It, Break It, Fix It project, a security-focused programming competition. This project provides data on software security problems by allowing contestants to write security-focused software in any programming language. We discover that using C leads to memory-safety issues that can compromise security.
Next, we consider making C safer. We develop and examine the Checked C programming language, a strict superset of C that adds types for spatial safety. We also introduce an automatic rewriting tool that converts C code into Checked C code. We evaluate the overall approach on benchmarks used by prior work on making C safer.
We then consider static analysis. After examining different parameters of numeric static analyzers, we develop a disjunctive abstract domain that uses a novel merge heuristic based on a notion of volumetric difference, either approximated via MCMC sampling or computed precisely via conical decomposition. This domain is implemented in a static analyzer for C programs and evaluated.
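The volumetric idea can be illustrated with plain Monte Carlo sampling (a simplification of the MCMC approach named above; the boxes, the `volume_added_by_join` name, and the dead-space criterion are invented for illustration, not the dissertation's algorithm):

```python
# Illustrative sketch: estimate how much "dead space" joining two boxes
# (interval-domain elements) would introduce. A disjunctive merge heuristic
# could keep two disjuncts separate when their join mostly covers points
# that belong to neither, i.e. when the volumetric difference is large.
import random

def inside(box, pt):
    return all(lo <= x <= hi for (lo, hi), x in zip(box, pt))

def join(a, b):
    """Smallest box covering both a and b (the interval-domain join)."""
    return [(min(al, bl), max(ah, bh)) for (al, ah), (bl, bh) in zip(a, b)]

def volume_added_by_join(a, b, samples=100_000, seed=0):
    """Fraction of the join's volume lying outside both a and b."""
    rng = random.Random(seed)
    j = join(a, b)
    outside = sum(
        1
        for _ in range(samples)
        if not inside(a, pt := [rng.uniform(lo, hi) for lo, hi in j])
        and not inside(b, pt)
    )
    return outside / samples

a = [(0.0, 1.0), (0.0, 1.0)]   # unit box at the origin
b = [(2.0, 3.0), (2.0, 3.0)]   # distant unit box: joining loses precision
print(volume_added_by_join(a, b))  # roughly 7/9: most of the join is dead space
```

Two disjoint unit boxes inside a 3x3 join leave 7 of 9 area units uncovered, so a threshold on this fraction gives a simple merge/keep-separate decision.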
Finally, we consider fuzzing. We examine what it takes to perform a good evaluation of a fuzzing technique through our own experiments and a review of recent fuzzing papers. We develop a checklist for conducting new fuzzing research and a general strategy for identifying the root causes of failures found during fuzzing. We evaluate new root-cause analysis approaches that use coverage information as input to statistical clustering algorithms.
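The coverage-clustering idea above can be sketched as follows, under the assumption (mine, not the dissertation's) that crashes reaching similar code share a root cause; the Jaccard distance, greedy clustering, and threshold are illustrative stand-ins for the actual statistical algorithms:

```python
# Hedged sketch: group crashing inputs whose coverage sets are similar, on the
# hypothesis that similar coverage implies a shared root cause. Real fuzzers
# emit branch/edge coverage bitmaps; sets of integer edge IDs stand in here.

def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| over sets of covered edge IDs."""
    return 1.0 - len(a & b) / len(a | b)

def cluster_crashes(coverage_by_crash, threshold=0.5):
    """Greedy single-pass clustering: join a crash to the first cluster
    whose representative coverage is within `threshold`, else start a new one."""
    clusters = []  # list of (representative_coverage, [crash_ids])
    for crash_id, cov in coverage_by_crash.items():
        for rep, members in clusters:
            if jaccard_distance(rep, cov) <= threshold:
                members.append(crash_id)
                break
        else:
            clusters.append((cov, [crash_id]))
    return [members for _, members in clusters]

coverage = {
    "crash1": {1, 2, 3, 4},
    "crash2": {1, 2, 3, 5},   # near-identical path: likely the same bug
    "crash3": {7, 8, 9},      # disjoint path: likely a different bug
}
print(cluster_crashes(coverage))  # -> [['crash1', 'crash2'], ['crash3']]
```

Each resulting cluster can then be triaged once, rather than per crash, which is the payoff root-cause analysis is after.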
The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data
The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition, we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.
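The flat-file access mentioned above includes the ontology itself, distributed in OBO format; a minimal parser sketch (the stanza tags `id`, `name`, `is_a` are standard OBO fields, but the sample stanzas below are illustrative, not copied from the project's actual download files):

```python
# Minimal sketch of reading HPO terms from an OBO-format flat file, one of the
# flat-file access routes the project provides. Real data would come from the
# project's downloads; the embedded sample is illustrative.
def parse_obo_terms(text):
    """Return ({HPO id: name}, {HPO id: [parent ids]}) from OBO text."""
    names, parents = {}, {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = None                      # next 'id:' line names the term
        elif line.startswith("id: "):
            current = line[4:]
            parents.setdefault(current, [])
        elif current and line.startswith("name: "):
            names[current] = line[6:]
        elif current and line.startswith("is_a: "):
            # 'is_a: HP:0000118 ! Phenotypic abnormality' -> keep only the ID
            parents[current].append(line[6:].split(" ! ")[0])
    return names, parents

sample = """\
[Term]
id: HP:0000118
name: Phenotypic abnormality

[Term]
id: HP:0000707
name: Abnormality of the nervous system
is_a: HP:0000118 ! Phenotypic abnormality
"""
names, parents = parse_obo_terms(sample)
print(names["HP:0000707"], parents["HP:0000707"])
```

The subclass (`is_a`) relations recovered this way are what make the 13,326-edge hierarchy traversable, e.g. for rolling patient annotations up to broader phenotype classes.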
The impact of digital start-up founders’ higher education on reaching equity investment milestones
This paper builds on human capital theory to assess the importance of formal education among graduate entrepreneurs. Using a sample of 4,953 digital start-ups, the paper evaluates the impact of start-up founding teams’ higher education on the probability of securing equity investment and subsequent exit for investors. The main findings are: (1) teams with a founder who has a technical education are less likely to remain self-financed and are more likely to secure equity investment and to exit, but the impact of technical education declines with higher-level degrees; (2) teams with a founder who has a doctoral-level business education are less likely to remain self-financed and have a higher probability of securing equity investment, while undergraduate and postgraduate business education have no significant effect; and (3) teams with a founder who has an undergraduate general education (arts and humanities) are less likely to remain self-financed and are more likely to secure equity investment and exit, while postgraduate and doctoral general education have no significant effect on securing equity investment and exit. The findings enhance our understanding of the factors that influence digital start-ups achieving equity milestones by showing the heterogeneous influence of different types of higher education, and therefore human capital, on new ventures. The results suggest that researchers and policy-makers should extend their consideration of universities’ entrepreneurial activity to include the development of human capital.
The Consensus Coding Sequence (CCDS) Project: Identifying a Common Protein-Coding Gene Set for the Human and Mouse Genomes
Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.
National Human Genome Research Institute (U.S.) (Grant number 1U54HG004555-01); Wellcome Trust (London, England) (Grant numbers WT062023 and WT077198)
Fc Effector Function Contributes to the Activity of Human Anti-CTLA-4 Antibodies.
With the use of a mouse model expressing human Fc-gamma receptors (FcγRs), we demonstrated that antibodies with isotypes equivalent to ipilimumab and tremelimumab mediate intra-tumoral regulatory T (Treg) cell depletion in vivo, increasing the CD8+ to Treg cell ratio and promoting tumor rejection. Antibodies with improved FcγR binding profiles drove superior anti-tumor responses and survival. In patients with advanced melanoma, response to ipilimumab was associated with the CD16a-V158F high-affinity polymorphism. Such activity only appeared relevant in the context of inflamed tumors, explaining the modest response rates observed in the clinical setting. Our data suggest that the activity of anti-CTLA-4 in inflamed tumors may be improved through enhancement of FcγR binding, whereas poorly infiltrated tumors will likely require combination approaches.
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome, including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.